Trajectory Grouping Structure

Kevin Buchin* Maike Buchin* Marc van Kreveld§ Bettina Speckmann* Frank Staals§



Abstract 

The collective motion of a set of moving entities like people, birds, or other animals, is characterized 
by groups arising, merging, splitting, and ending. Given the trajectories of these entities, we define
and model a structure that captures all such changes using the Reeb graph, a concept from topology.
The trajectory grouping structure has three natural parameters that allow more global views of
the data in group size, group duration, and entity inter-distance. We prove complexity bounds on the 
maximum number of maximal groups that can be present, and give algorithms to compute the group- 
ing structure efficiently. We also study how the trajectory grouping structure can be made robust, 
that is, how brief interruptions of groups can be disregarded in the global structure, adding a notion 
of persistence to the structure. Furthermore, we showcase the results of experiments using data gen- 
erated by the NetLogo flocking model and from the Starkey project. The Starkey data describe the 
movement of elk, deer, and cattle. Although there is no ground truth for the grouping structure in 
this data, the experiments show that the trajectory grouping structure is plausible and has the desired 
effects when changing the essential parameters. Our research provides the first complete study of 
trajectory group evolvement, including combinatorial, algorithmic, and experimental results. 
1 Introduction



In recent years there has been an increase in location-aware devices and wireless communication net- 
works. This has led to a large amount of trajectory data capturing the movement of animals, vehicles, 
and people. The increase in trajectory data goes hand in hand with an increasing demand for techniques 
and tools to analyze them, for example, in transportation sciences, sports, ecology, and social services. 

An important task is the analysis of movement patterns. In particular, given a set of moving entities 
we wish to determine when and which subsets of entities travel together. When a sufficiently large set 
of entities travels together for a sufficiently long time, we call such a set a group (we give a more formal 
definition later). Groups may start, end, split and merge with other groups. Apart from the question of
what the current groups are, we also want to know which splits and merges led to the current groups, when
they happened, and which groups they involved. We wish to capture this group change information in a 
model that we call the trajectory grouping structure. 

The informal definition above suggests that three parameters are needed to define groups: (i) a spatial 
parameter for the distance between entities; (ii) a temporal parameter for the duration of a group; (iii) a 
count for the number of entities in a group. We will design our grouping structure definition to incor- 
porate these parameters so that we can study grouping at different scales. We use the three parameters 
as follows: a small spatial parameter implies we are interested only in spatially close groups, a large 
temporal parameter implies we are interested only in long-lasting groups, and a large count implies we 
are interested only in large groups. By adjusting the parameters suitably, we can obtain more detailed or 
more generalized views of the trajectory grouping structure. 

The use of scale parameters and the fact that the grouping structure changes at discrete events sug- 
gest the use of computational topology [4]. In particular, we use Reeb graphs to capture the grouping 
structure. Reeb graphs have been used extensively in shape analysis and the visualization of scientific 
data (see e.g. [2, 6, 8]). A Reeb graph captures the structure of a two- or higher-dimensional scalar func- 
tion, by considering the evolution of the connected components of the level sets. The computation of 
Reeb graphs has received considerable attention in computational geometry and topology; an overview 
is given in [3]. Recently, a deterministic O(n log n) time algorithm was presented for constructing the
Reeb graph of a 2-skeleton of size n [18]. Edelsbrunner et al. [6] discuss time-varying Reeb graphs
for continuous space-time data. Although we also analyze continuous space-time data (2D-space in our 
case), our Reeb graphs are not time-varying, but time is the parameter that defines the Reeb graph. Ge et 
al. [9] use the Reeb graph to compute a one-dimensional "skeleton" from unorganized data. In contrast to 
our setting, in their applications the data comes without a time component. They use a proximity graph 
on the input points to build a simplicial complex from which they compute the Reeb graph. 

Our research is motivated by and related to previous research on flocks [1, 10, 11, 21], herds [12], 
convoys [14], moving clusters [15], mobile groups [13, 22] and swarms [16]. These concepts differ from 
each other in the way in which space and time are used to test if entities form a group: do the entities stay 
in a single disc or are they density-connected [7], should they stay together during consecutive time steps 
or not, can the group members change over time, etc. Only the herds concept [12] includes the splitting 
and merging of groups. 

Contributions. We present the first complete study of trajectory group evolvement, including com- 
binatorial, algorithmic, and experimental results. Our research differs from and improves on previous 
research in the following ways: Firstly, our model is simpler than herds and thus more intuitive. Sec- 
ondly, we consider the grouping structure at continuous times instead of at discrete steps (which was 
done only for flocks). Thirdly, we analyze the algorithmic and combinatorial aspects of groups and their 
changes. Fourthly, we implemented our algorithms and provide evidence that our model captures the 
grouping structure well and can be computed efficiently. Fifthly, we extend the model to incorporate 
persistence. 

We created videos based on our implementation showing the maximal groups we found in simulated 
NetLogo flocking data [23, 24] and in real-world data from the Starkey project [17]. 
A Definition for a Group. Let X be a set of entities of which we have locations during some time span.
The ε-disc of an entity x (at time t) is a disc of radius ε centered at x at time t. Two entities are directly
connected at time t if their ε-discs overlap. Two entities x and y are ε-connected at time t if there is a
sequence x = x_0, .., x_k = y of entities such that for all i, x_i and x_{i+1} are directly connected.

A subset S ⊆ X of entities is ε-connected at time t if all entities in S are pairwise ε-connected at
time t. This means that the union of the ε-discs of entities in S forms a single connected region. The
set S forms a component at time t if and only if S is ε-connected, and S is maximal with respect to this
property. The set of components C(t) at time t forms a partition of the entities in X at time t.

Let the spatial parameter of a group be ε, the temporal parameter δ, and the size parameter m. A set
G of k entities forms a group during time interval I if and only if the following three conditions hold:
(i) G contains at least m entities, so k ≥ m, (ii) the interval I has length at least δ, and (iii) at all times
t ∈ I, there is a component C ∈ C(t) such that G ⊆ C.
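The definitions above can be illustrated at a single time instant. The following is a minimal sketch, not the authors' implementation: the function names `components` and `is_group_at` and the input format (a dictionary of 2D positions) are assumptions made for illustration, and discs that merely touch are treated as overlapping. It computes the ε-connected components with a small union-find and checks conditions (i) and (iii) of the group definition for one time t; condition (ii) on the duration would require repeating the check over a whole interval.

```python
from itertools import combinations
from math import dist


def components(positions, eps):
    """Partition entities into eps-connected components at one time instant.

    positions maps an entity name to its (x, y) location at that time.
    Two entities are directly connected when their eps-discs overlap,
    which we take to mean that their centers are at distance <= 2 * eps.
    """
    parent = {x: x for x in positions}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for x, y in combinations(positions, 2):
        if dist(positions[x], positions[y]) <= 2 * eps:
            parent[find(x)] = find(y)

    by_root = {}
    for x in positions:
        by_root.setdefault(find(x), set()).add(x)
    return [frozenset(c) for c in by_root.values()]


def is_group_at(G, positions, eps, m):
    """Check conditions (i) and (iii) of the group definition at one time."""
    G = frozenset(G)
    return len(G) >= m and any(G <= c for c in components(positions, eps))
```

Note that two entities can be ε-connected without being directly connected: in a chain a, b, c with consecutive distances below 2ε, the set {a, c} passes the check even if a and c are far apart.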



We denote the interval I = [t_s, t_e] of group G with I_G. Group H covers group G if G ⊆ H and
I_G ⊆ I_H. If there are no groups that cover G, we say G is maximal (on I_G). In Fig. 1, groups
{x_1, x_2}, G_1 = {x_3, x_4}, G_2 = {x_5, x_6}, and G_3 = {x_1, .., x_4} are maximal: G_1 and G_2 on
[t_0, t_5], G_3 on [t_1, t_2]. Group {x_1, x_3} is covered by G_3 and hence not maximal.

Figure 1: For m = 2 and δ > t_4 − t_3 there are four maximal groups: {x_1, x_2}, {x_3, x_4},
{x_5, x_6}, and {x_1, .., x_4}.

Note that entities can be in multiple maximal groups at the same time. For example, entities
{y_1, y_2, y_3} can travel together for a while, then y_4, y_5 may become ε-connected, and shortly
thereafter y_1, y_4, y_5 separate and travel together for a while. Then y_1 may be in two otherwise
disjoint maximal groups for a short time. An entity can also be in two maximal groups where one is a
subset of the other. In that case the group with fewer entities must last longer. That an entity is in
more groups simultaneously may seem counterintuitive at first, but it is necessary to capture all
grouping information. We will show that the total number of maximal groups is O(rn^3), where n is the
number of entities in X and r is the number of edges of each input trajectory. This bound is tight in
the worst case.

Our maximal group definition uses three parameters, which all allow a more global view of the grouping
structure. In particular, we observe that there is monotonicity in the group size and the duration: If
G is a group during interval I, and we decrease the minimum required group size m or decrease the
minimum required duration δ, then G is still a group on time interval I. Also, if G is a maximal group
on I, then it is also a maximal group for a smaller m or smaller δ. For the spatial parameter ε we observe
monotonicity in a slightly different manner: if G is a group for a given ε, then for a larger value of ε there
exists a group G' ⊇ G. The monotonicity property is important when we want to have a more detailed
view of the data: we do not lose maximal groups in a more detailed view. The group may, however, be
extended in size and/or duration.

We capture the grouping structure using a Reeb graph of the ε-connected components together with
the set of all maximal groups. Parts of the Reeb graph that do not support a maximal group can be
omitted. The grouping structure can help us in answering various questions. For example:

• What is the largest/longest maximal group at time t?

• How many entities are currently (not) in any maximal group?

• What is the first maximal group that starts/ends after time t?

• What is the total time that an entity was part of any maximal group?

• Which entity has shared maximal groups with the most other entities?

Furthermore, the grouping structure can be used to partition the trajectories in independent data sets, to
visualize grouping aspects of the trajectories, and to compare grouping across different data sets.

We also discuss robustness of the grouping structure in the following sense. If an entity x leaves a
group G and almost immediately returns, we would like to ignore the small interval on which x and G
were separate, and just consider G ∪ {x} as one group. The maximal group definition given above is
not robust, but later in the paper we will study an extension that is. Note that robustness requires an
additional parameter that captures how long an interruption in a group may last and still be ignored.

Results and Organization. We discuss how to represent the grouping structure in Section 2, and prove
that there are always O(rn^3) maximal groups, which is tight in the worst case. Here n is the number of
trajectories (entities) and r the number of edges in each trajectory. We present an algorithm to compute
the trajectory grouping structure and all maximal groups in Section 3. This algorithm runs in O(rn^3 + N)
time, where N is the total output size. In Section 4 we make our definitions more robust, and extend our
algorithms to this case. In Section 5 we evaluate our methods on synthetic and real-world data.

2 Representing the Grouping Structure 

Let X be a set of n entities, where each entity travels along a path of r edges. To compute the grouping
structure we consider a manifold ℳ in ℝ^3, where the z-axis corresponds to time. The manifold ℳ is
the union of n "tubes" (see Fig. 2(a)). Each tube consists of r skewed cylinders with horizontal radius ε
that we obtain by tracing the ε-disc of an entity x over its trajectory.

Let H_t denote the horizontal plane at height t, then the set ℳ ∩ H_t is the level set of t. The connected
components in the level set of t correspond to the components (maximal sets of ε-connected entities) at
time t. We will assume for simplicity that all trajectories have their known positions at the same times
t_0, .., t_r and that no three entities become ε-(dis)connected at the same time, but most of our theory does
not depend on these assumptions.

2.1 The Reeb Graph 

We start out with a possibly disconnected solid that is the union of a collection of tube-like regions: a 3- 
manifold with boundary. Note that this manifold is not explicitly defined. We are interested in horizontal 
cross-sections, and the evolution of the connected components of these cross-sections defines the Reeb 
graph. Note that this is different from the usual Reeb graph that is obtained from the 2-manifold that is 
the boundary of our 3-manifold, using the level sets of the height function (the function whose level sets 
we follow is the height function above a horizontal plane below the manifold), see [4] for a background 
on these topics. 

To describe how the components change over time, we consider the Reeb graph ℛ of ℳ (Fig. 2(b)).
The Reeb graph has a vertex v at every time t_v where the components change. The vertex times are
usually not at any of the given times t_0, .., t_r, but in between two consecutive time steps. The vertices of
the Reeb graph can be classified in four groups. There is a start vertex for every component at t_0 and an
end vertex for every component at t_r. A start vertex has in-degree zero and out-degree one, and an end
vertex has in-degree one and out-degree zero. The remaining vertices are either merge vertices or split
vertices. Since we assume that no three entities become ε-(dis)connected at exactly the same time, there
are no simultaneous splits and merges. This means merge vertices have in-degree two and out-degree one,
and split vertices have in-degree one and out-degree two. A directed edge e = (u, v) connecting vertices
u and v, with t_u < t_v, corresponds to a set C_e of entities that form a component at any time
t ∈ I_e = [t_u, t_v]. The Reeb graph is this directed graph. Note that the Reeb graph depends on the
spatial parameter ε, but not on the other two parameters of maximal groups.

Figure 2: The manifold for the entities X = {x_1, .., x_5} (a), and the corresponding Reeb graph (b).

Lemma 1 The Reeb graph ℛ for a set X of n entities, each of which travels along a trajectory of r
edges, can have Ω(rn^2) vertices and Ω(rn^2) edges.

Proof. We construct n trajectory edges on which the entities travel in between two consecutive time
stamps, say t_i and t_{i+1}, such that the Reeb graph for ε = 0 has Ω(n^2) vertices v with t_v ∈ [t_i, t_{i+1}].
We use this construction in between all times t_{2i} and t_{2i+1}, and move the entities back to their starting
position in between t_{2i+1} and t_{2i+2}. Therefore, the total number of vertices is Ω(rn^2). Since each vertex
has degree one or three it follows that the number of edges is also Ω(rn^2).



Figure 3: Every pair of entities r_j and d_ℓ is at the same point at time t_i + j + ℓ. This yields Ω(n^2)
vertices in the interval [t_i, t_{i+1}].



Let X = R ∪ D, with R = r_1, .., r_{n/2} and D = d_1, .., d_{n−n/2}. At the start (time t_i) all entities start at
the line y = x. In particular, we place r_j on (−j, −j) and d_ℓ on (ℓ, ℓ). All entities move with speed one.
The entities in R move to the right, and the entities in D move downwards (see Fig. 3). It follows that
each pair of entities r_j and d_ℓ is at the same point at time t_i + j + ℓ. Hence, we get a vertex in the Reeb
graph. There are Θ(n^2) such intersections, and thus Ω(n^2) vertices. The lemma follows. □
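The meeting times claimed in the construction can be checked with a few lines of arithmetic. The sketch below is only an illustration under the stated setup (start positions (−j, −j) and (ℓ, ℓ), unit speed, times measured relative to t_i); the function name `meeting_times` is hypothetical.

```python
def meeting_times(n):
    """Map each pair (j, l) to the time after t_i at which r_j and d_l coincide."""
    half = n // 2
    times = {}
    for j in range(1, half + 1):
        for l in range(1, n - half + 1):
            s = j + l                 # claimed meeting time, relative to t_i
            r_pos = (-j + s, -j)      # r_j starts at (-j, -j) and moves right
            d_pos = (l, l - s)        # d_l starts at (l, l) and moves down
            assert r_pos == d_pos     # both are at (l, -j) at time s
            times[(j, l)] = s
    return times
```

For n = 8 this yields 16 coinciding pairs, one for each combination of an entity in R and an entity in D, which is the quadratic number of events the proof relies on.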



Theorem 1 Given a set X of n entities, in which each entity travels along a trajectory of r edges, the
Reeb graph ℛ = (V, E) has O(rn^2) vertices and edges. These bounds are tight in the worst case.

Proof. Lemma 1 gives a simple construction that shows that the Reeb graph may have Ω(rn^2) vertices
and edges. For the upper bound, consider a trajectory edge (v_i, v_{i+1}) of (the trajectory of) entity x ∈ X.
Another entity y ∈ X is directly connected to x during at most one interval I ⊆ [t_i, t_{i+1}]. This interval
yields at most two vertices in ℛ. The trajectory of x consists of r edges, hence a pair x, y produces
O(r) vertices in ℛ. This gives a total of O(rn^2) vertices. Each vertex has constant degree, so there are
O(rn^2) edges. □
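The counting in the upper bound can be written out explicitly. The following is a sketch of the arithmetic only, under the simplifying assumption from Section 2 that all trajectories share the same time stamps; the additive term for start and end vertices (at most n each) is spelled out here although the proof leaves it implicit.

```latex
\[
  |V| \;\le\;
  \underbrace{\binom{n}{2}}_{\text{pairs } x,\,y}
  \cdot \underbrace{r}_{\text{trajectory edges}}
  \cdot \underbrace{2}_{\text{vertices per interval}}
  \;+\; \underbrace{2n}_{\text{start and end vertices}}
  \;=\; r\,n(n-1) + 2n \;=\; O(rn^2).
\]
```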



The Trajectory Grouping Structure. The trajectories of entities are associated with the edges of the 
Reeb graph in a natural way. Each entity follows a directed path in the Reeb graph from a start vertex to 
an end vertex. Similarly, (maximal) groups follow a directed path from a start or merge vertex to a split or 
end vertex. If m > 1 or δ > 0, there may be edges in the Reeb graph with which no group is associated.
These edges do not contribute to the grouping structure, so we can discard them. The remainder of the 
Reeb graph we call the reduced Reeb graph, which, together with all maximal groups associated with its 
edges, forms the trajectory grouping structure. 
2.2 Bounding the Number of Maximal Groups

To bound the total number of maximal groups, we study the case where m = 1 and δ = 0, because larger
values can only reduce the number of maximal groups. It may seem as if each vertex in the Reeb graph
simply creates as many maximal groups as it has outgoing edges. However, consider for example Fig. 4.
Split vertex v creates not only the maximal groups {1, 3, 5, 7} and {2, 4, 6, 8}, but also {1, 3}, {5, 7},
{2, 4}, and {6, 8}. These last four groups are all maximal on [t_2, t], for t > t_4. Notice that all six
newly discovered groups start strictly before t_v, but only at t_v do we realize that these groups are
maximal, which is the meaning that should be understood with "creating maximal groups". This example
can be extended to arbitrary size. Hence a vertex v may create many new maximal groups, some of which
start before t_v. We continue to show that we may obtain Ω(rn^3) maximal groups, and that it cannot get
worse than that, that is, the number of maximal groups is at most O(rn^3) as well.

Figure 4: The maximal groups containing entity 3 (green). Vertex v creates six new groups, including
{1, 3} and {1, 3, 5, 7}.



Lemma 2 For a set X of n entities, in which each entity travels along a trajectory of r edges, there can
be Ω(rn^3) maximal groups.

Proof. Similar to Lemma 1 we construct n trajectory edges on which the entities travel in between t_i
and t_{i+1}, and repeat this construction in O(r) time steps. Our construction yields Ω(n^3) maximal groups
G with I_G ⊆ [t_i, t_{i+1}], resulting in Ω(rn^3) maximal groups overall as claimed.

For ease of notation we assume that n is divisible by four, and we write x to denote both the entity
x and the ε-disc of entity x. We partition our set of entities X into two sets S and D. The entities
in S = {s_1, .., s_{3n/4}} are stationary. They all lie on the line y = 0, ordered from left to right, with a
distance r < 2ε in between two consecutive entities (in this construction r denotes this distance, not the
number of trajectory edges). Hence S is ε-connected.

The remaining entities D will move on a horizontal line y = v, for some ε < v < 2ε. At time t_i, the
discs D = {d_1, .., d_{n/4}}, ordered from right to left, all lie to the left of the discs in S. They all move to
the right with the same speed. The distance h_i between d_i and d_{i+1} is r + (n/4 − i)μ, for some small
μ > 0. Hence, the distances get smaller the further the discs are to the left. See Fig. 5 for an illustration
of this construction.



Figure 5: The lower bound construction for n = 16. The black discs correspond to the stationary entities
in S. The red (grey) discs correspond to the entities in D.



We can choose the exact values for r and v such that the sequence of events can be partitioned into
rounds. Round i consists of a series of k_i merge events followed by a series of k_i split events. In a series
J_1, .., J_k of merges the discs d_1, .., d_k become directly connected with discs in S. Merge J_j will start a
new maximal group G_{1j}, where G_{ij} = S ∪ ⋃_{ℓ=i}^{j} d_ℓ. Hence after the k merges, k maximal groups have
started. In the subsequent series P_1, .., P_k of split events, the discs d_1, .., d_k stop being directly connected
with a disc in S. When d_i leaves, the sets of entities G_{ii}, .., G_{ik} end as maximal groups. However, when
d_i leaves G_{ij}, it creates G_{(i+1)j} as a new maximal group that started on J_j (see Fig. 6). This means
P_i creates k − i new maximal groups.

Figure 6: The time intervals on which G_{ij} is a maximal group in a given round.

We now show that, for any m ≤ 3n/4 and any δ, this construction yields Ω(n^3) maximal groups.
Since we can choose the speed of the discs in D, we can choose it such that all groups have a minimum
duration of at least δ. Now consider the rounds n/2, .., 3n/4. In each of these rounds we have k = n/4
merges followed by n/4 splits. The splits in each round create a total of Σ_{i=1}^{n/4} (n/4 − i) = Ω(n^2) new
maximal groups. Each of these groups contains S, hence its size is at least 3n/4. It follows that the total
number of maximal groups in those n/4 rounds is Ω(n^3). □
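The sum in the last step of the proof has a simple closed form. The following worked evaluation is a sketch of that arithmetic with k = n/4 splits per round and n/4 rounds, as in the proof; the constants are not stated in the text and follow only from expanding the sum.

```latex
\[
  \sum_{i=1}^{k} (k - i) \;=\; \frac{k(k-1)}{2},
  \qquad\text{so for } k = \tfrac{n}{4}:\quad
  \sum_{i=1}^{n/4} \Bigl(\tfrac{n}{4} - i\Bigr)
  \;=\; \frac{n(n-4)}{32} \;=\; \Omega(n^2).
\]
\[
  \text{Over } \tfrac{n}{4} \text{ rounds:}\qquad
  \frac{n}{4}\cdot\frac{n(n-4)}{32} \;=\; \frac{n^2(n-4)}{128} \;=\; \Omega(n^3).
\]
```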

Theorem 2 Let X be a set of n entities, in which each entity travels along a trajectory of r edges. There
are at most O(rn^3) maximal groups, and this is tight in the worst case.

Proof. Lemma 2 gives a construction that shows that there may be Ω(rn^3) maximal groups.

We proceed with the upper bound. Every maximal group starts either at a start vertex, or a merge
vertex. We will show that the number of maximal groups starting at a start or merge vertex is O(n).
Since there are O(rn^2) start and merge vertices the bound follows. We will discuss only the merge
vertex case; the proof for a start vertex is the same.

Let v be a merge vertex, let S ⊆ X and T ⊆ X be the components that merge at v, and let p_x denote
the path of entity x ∈ S ∪ T through ℛ, starting at v. The union over all x of these paths p_x forms a
directed acyclic graph (DAG) ℛ'_v, which is a subgraph of ℛ (see Fig. 7 (a)). Consider "unraveling" ℛ'_v
into a tree T_v as follows. If p_x and p_y split in some vertex u and merge again in vertex w, with t_w > t_u,
we duplicate the subpath starting at w. This yields a tree T_v with root v and at most |S| + |T| ≤ n leaves.
Furthermore, all nodes in T_v have degree at most three (see Fig. 7 (b)).




Figure 7: DAG ℛ'_v (black) as a subgraph of ℛ (grey) (a), and the tree T_v obtained by unfolding ℛ'_v (b).

Since all maximal groups end at either a split or an end vertex, all maximal groups G_1, .., G_k that start
at v can now be represented by subpaths in T_v starting at the root. The path corresponding to a maximal
group G ends at the first node where two entities x, y ∈ G split, or at a leaf if no such node exists.
Clearly, paths p_x and p_y can split only at a degree three node. Since T_v has at most n leaves it follows
there are at most O(n) degree three nodes.

Finally, we show that there is at most one maximal group that ends at a given leaf or degree three node
of T_v. Assume by contradiction that G_i and G_j, with i ≠ j, both end at node u. Both maximal groups
share the same path from the root of T_v to u, so all entities in G_i and G_j are in the same component at
all times t ∈ I = [t_v, t_u]. Hence G_i ∪ G_j is a maximal group on I, contradicting that G_i and G_j were
maximal. We conclude that the number of maximal groups k that start at v is at most the number of
leaves plus the number of degree three nodes in T_v. Hence k = O(n). Summing over all O(rn^2) start
and merge vertices gives O(rn^3) maximal groups in total. □
3 Computing the Grouping Structure



To compute the grouping structure we need to compute the reduced Reeb graph and the maximal groups. 
We now show how to do this efficiently. Removing the edges of the Reeb graph that are not used is an 
easy post-processing step which we do not discuss further. 

3.1 Computing the Reeb Graph 

We can compute the Reeb graph ℛ = (V, E) as follows. We first compute all times where two entities x
and y are at distance 2ε from each other. We distinguish two types of events, connect events at which x
and y become directly connected, and disconnect events at which x and y stop being directly connected.
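For a single pair of entities on one trajectory edge, these event times are the roots of a quadratic in the interpolation parameter. The sketch below is an illustration, not the authors' code: the function name `pair_events` and its argument layout are assumptions, both entities are taken to move linearly between the same two time stamps, and events that fall exactly on a time stamp are ignored for brevity.

```python
from math import sqrt


def pair_events(x0, x1, y0, y1, t0, t1, eps):
    """Connect/disconnect events of two entities on one trajectory edge.

    Entity x moves linearly from x0 to x1 and entity y from y0 to y1 during
    [t0, t1]. Returns (time, kind) pairs for the moments strictly inside
    (t0, t1) at which the distance between x and y equals 2 * eps.
    """
    ax, ay = x0[0] - y0[0], x0[1] - y0[1]                # offset at time t0
    bx, by = (x1[0] - y1[0]) - ax, (x1[1] - y1[1]) - ay  # change of the offset
    # Squared distance at parameter s in [0, 1] is qa*s^2 + qb*s + |a|^2.
    qa = bx * bx + by * by
    qb = 2 * (ax * bx + ay * by)
    qc = ax * ax + ay * ay - (2 * eps) ** 2
    if qa == 0:
        return []                   # constant offset: the distance never changes
    disc = qb * qb - 4 * qa * qc
    if disc <= 0:
        return []                   # the distance never drops below 2 * eps
    root = sqrt(disc)
    s_in, s_out = (-qb - root) / (2 * qa), (-qb + root) / (2 * qa)
    events = []
    if 0 < s_in < 1:
        events.append((t0 + s_in * (t1 - t0), "connect"))
    if 0 < s_out < 1:
        events.append((t0 + s_out * (t1 - t0), "disconnect"))
    return events
```

Because the squared distance is a convex quadratic in s, the pair is directly connected on at most one subinterval of the edge, which is the fact used in the proof of Theorem 1.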

We now process the events in order of increasing time while maintaining the current components. We do
this by maintaining a graph G = (X, Z) representing the directly-connected relation, and the connected
components in this graph. The set of vertices in G is the set of entities. The graph G changes over time:
at connect events we insert new edges into G, and at disconnect events we remove edges.

At any given time t, G contains an edge (x, y) if and only if x and y are directly connected at time t.
Hence the components at t (the maximal sets of ε-connected entities) correspond to the connected
components in G at time t. Since we know all times at which G changes in advance, we can use the same
approach as Parsa [18] to maintain the connected components: we assign a weight to each edge in G
and we represent the connected components using a maximum weight spanning forest. The weight of
edge (x, y) is equal to the time at which we remove it from G, that is, the time at which x and y become
directly disconnected. We store the maximum weight spanning forest F as an ST-tree [19], which allows
connectivity queries, inserts, and deletes, in O(log n) time.

We spend O(n^2) time to initialize the graph G at t_0 in a brute-force manner. For each component we
create a start vertex in ℛ. We also initialize a one-to-one mapping M from the current components in
G to the corresponding vertices in ℛ. When we handle a connect event of entities x and y at time t,
we query F to get the components C_x and C_y containing x and y, respectively. Using M we locate the
corresponding vertices v_x and v_y in ℛ. If C_x ≠ C_y we create a new merge vertex v in ℛ with time
t_v = t, add edges (v_x, v) and (v_y, v) to ℛ labeled C_x and C_y, respectively. If C_x = C_y we do not change
ℛ. Finally, we add the edge (x, y) to G (which may cause an update to F), and update the mapping M.

At a disconnect event we first query F to find the component C currently containing x and y. Using
M we locate the vertex u corresponding to C. Next, we delete the edge (x, y) from G, and again query
F. Let C_x and C_y denote the components containing x and y, respectively. If C_x = C_y we are done,
meaning x and y are still ε-connected. Otherwise we add a new split vertex v to ℛ with time t_v = t, and
an edge e = (u, v) with C_e = C as its component. We update M accordingly.

Finally, we add an end vertex v for each component C in F with t_v = t_r. We connect the vertex
u = M(C) to v by an edge e = (u, v) and let C_e = C be its component.
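The event handling above can be sketched in a simplified form. The code below is not the algorithm analyzed in this section: instead of a maximum weight spanning forest stored in an ST-tree, it recomputes the affected component with a graph search at every event, which is easier to read but costs time linear in the size of G per event rather than O(log n). The function name `reeb_graph` and the event tuple format `(time, kind, x, y)` are assumptions made for illustration.

```python
def reeb_graph(entities, initial_edges, events, t_start, t_end):
    """Build the Reeb graph from time-sorted connect/disconnect events.

    Returns (vertices, edges): vertices[i] = (time, kind) and each edge is
    (u, v, component) with u, v indices into vertices.
    """
    adj = {x: set() for x in entities}
    for x, y in initial_edges:
        adj[x].add(y)
        adj[y].add(x)

    def comp_of(x):
        # Depth-first search; stands in for the connectivity query on F.
        seen, stack = {x}, [x]
        while stack:
            for z in adj[stack.pop()]:
                if z not in seen:
                    seen.add(z)
                    stack.append(z)
        return frozenset(seen)

    vertices, edges, M = [], [], {}  # M maps a current component to its vertex

    def new_vertex(time, kind):
        vertices.append((time, kind))
        return len(vertices) - 1

    for x in entities:
        C = comp_of(x)
        if C not in M:
            M[C] = new_vertex(t_start, "start")

    for time, kind, x, y in events:
        if kind == "connect":
            Cx, Cy = comp_of(x), comp_of(y)
            adj[x].add(y)
            adj[y].add(x)
            if Cx != Cy:                      # two components merge
                v = new_vertex(time, "merge")
                edges.append((M.pop(Cx), v, Cx))
                edges.append((M.pop(Cy), v, Cy))
                M[Cx | Cy] = v
        else:
            C = comp_of(x)
            adj[x].discard(y)
            adj[y].discard(x)
            Cx = comp_of(x)
            if y not in Cx:                   # the component really splits
                v = new_vertex(time, "split")
                edges.append((M.pop(C), v, C))
                M[Cx] = v
                M[comp_of(y)] = v

    for C, u in M.items():
        edges.append((u, new_vertex(t_end, "end"), C))
    return vertices, edges
```

With three entities a, b, c that start apart, connect a to b, then b to c, and finally disconnect a from b, the sketch produces three start vertices, two merge vertices, one split vertex, and two end vertices.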

Analysis. We need O(rn^2 log n) time to compute all O(rn^2) events and sort them according to
increasing time. To handle an event we query F a constant number of times, and we insert or delete an
edge in F. These operations all take O(log n) time. So the total time required for building ℛ is
O(rn^2 log n).

Theorem 3 Given a set X of n entities, in which each entity travels along a trajectory of r edges, the
Reeb graph ℛ = (V, E) has O(rn^2) vertices and edges, and can be computed in O(rn^2 log n) time.

3.2 Computing the Maximal Groups 

We now show how to compute all maximal groups using the Reeb graph ℛ = (V, E). We will ignore the
requirements that each maximal group should contain at least m entities and have a minimal duration of
δ. That is, we assume m = 1 and δ = 0. It is easy to adapt the algorithm for larger values.

Labeling the Edges. Our algorithm labels each edge e = (u, v) in the Reeb graph with a set of maximal
groups 𝒢_e. The groups G ∈ 𝒢_e are those groups for which we have discovered that G is a maximal group
at a time t ≤ t_u. Each maximal group G becomes maximal at a vertex, either because a merge vertex
created G as a new group that is maximal, or because G is now a maximal set of entities that is still
together after a split vertex. This means we can compute all maximal groups as follows.

We traverse the set of vertices of ℛ in topological order. For every vertex v we compute the maximal
groups on its outgoing edge(s) using the information on its incoming edge(s).

If v is a start vertex it has one outgoing edge e = (v, u). We set 𝒢_e to {(C_e, t_v)} where t_v = t_0.
If v is a merge vertex it has two incoming edges, e_1 and e_2. We propagate the maximal groups from
e_1 and e_2 on to the outgoing edge e, and we discover (C_e, t_v) as a new maximal group. Hence
𝒢_e = 𝒢_{e_1} ∪ 𝒢_{e_2} ∪ {(C_e, t_v)}.

If v is a split vertex it has one incoming edge e, and two outgoing edges e_1 and e_2. A maximal group
G on e may end at v, continue on e_1 or e_2, or spawn a new maximal group G' ⊂ G on either e_1 or e_2.
In particular, for any group G' in 𝒢_{e_i}, there is a group G in 𝒢_e such that G' = G ∩ C_{e_i} ≠ ∅. The
starting time of G' is t' = min{t | (G, t) ∈ 𝒢_e ∧ G' ⊆ G}. Thus, t' is the first time G' was part of a
maximal group on e. Stated differently, t' is the first time G' was in a component on a path to v. Fig. 8
illustrates this case. If v is an end vertex it has no outgoing edges. So there is nothing to be done.
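The three labeling rules can be sketched with plain dictionaries, ignoring the efficient tree representation. This is an illustrative reading of the rules for m = 1 and δ = 0, not the authors' implementation: a label is stored as a hypothetical mapping from a group (a frozenset of entities) to its starting time, and the function names are assumptions.

```python
def label_start(component, t0):
    """Label of the edge leaving a start vertex: only the component itself."""
    return {frozenset(component): t0}


def label_merge(groups_e1, groups_e2, t_v):
    """Label of the edge leaving a merge vertex at time t_v."""
    # The component of an incoming edge is the largest group in its label.
    component = max(groups_e1, key=len) | max(groups_e2, key=len)
    out = {**groups_e1, **groups_e2}
    out[component] = t_v          # the merged component is a new maximal group
    return out


def label_split(groups_e, component_i):
    """Label of the outgoing edge e_i with component C_{e_i} after a split."""
    Ci = frozenset(component_i)
    candidates = {G & Ci for G in groups_e if G & Ci}
    # Starting time of G' is the earliest start of a group on e containing G'.
    return {
        Gp: min(t for G, t in groups_e.items() if Gp <= G)
        for Gp in candidates
    }
```

As a usage example in the spirit of Fig. 4 (the concrete event times here are assumed): let entities 1..8 merge pairwise at time 1, into {1, 2, 3, 4} and {5, 6, 7, 8} at time 2, and into one component at time 3. Splitting off {1, 3, 5, 7} then yields the new groups {1, 3} and {5, 7} with starting time 2 and {1, 3, 5, 7} with starting time 3, alongside the four singletons.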

Fig. 9 shows a complete example of a Reeb graph after labeling the edges with their maximal groups. 

Storing the Maximal Groups. We need a way to store the maximal groups 𝒢_e on an edge e = (u, v)
in such a way that we can efficiently compute the set(s) of maximal groups on the outgoing edge(s) of
a vertex v. We now show that we can use a tree T_e to represent 𝒢_e, with which we can handle a merge
vertex in O(1) time, and a split vertex in O(k) time, where k is the number of entities involved. The tree
uses O(k) storage.

We say a group G is a subgroup of a group H if and only if G ⊆ H and I_H ⊆ I_G. For example, in
Fig. 1 {x_1, x_2} is a subgroup of {x_1, .., x_4}. Note that both G and H could be maximal.

Lemma 3 Let e be an edge of ℛ, and let S and T be maximal groups in 𝒢_e with starting times t_S and
t_T, respectively. There is also a maximal group G ⊇ S ∪ T on e with starting time t_G ≥ max(t_S, t_T),
and if S ∩ T ≠ ∅ then S is a subgroup of T or vice versa.




Figure 8: After split vertex v, 𝒢_{e_1} contains the groups C_{e_1} = G_1 ∪ G_2 (with starting time t_s),
G_1, and G_2. Maximal groups C_{e_2} = G_3 ∪ G_4 (with starting time t_u), G_3, and G_4 go to e_2. The
maximal groups C_e and G_1 ∪ G_2 ∪ G_3 end at v.



Proof. The first statement is almost trivial. Clearly, S, T ⊆ C_e and hence S ∪ T ⊆ C_e. Component C_e
itself is also a maximal group on e. By construction C_e must have the largest starting time t of the groups
in 𝒢_e. Hence t_G ≥ max(t_S, t_T).

We prove the second statement by contradiction: assume S ∩ T ≠ ∅, and S ⊈ T or vice versa. Assume
w.l.o.g. that t_S ≤ t_T. So the entities in S are all in a single component at all times t ≥ t_T ≥ t_S. At any
time t ≥ t_T all entities in T are also in a single component. Since S ∩ T ≠ ∅ this must be the same
component that contains S. Hence S ⊆ T, which together with t_S ≤ t_T proves the statement. □




Figure 9: The maximal groups as computed by our algorithm (a set $\{i, j, k\}$ is denoted by $ijk$).

We represent the groups $\mathcal{G}_e$ on an edge $e \in E$ by a tree $T_e$ (see Fig. 10). We call this the grouping tree. Each node $v$ represents a group $G_v \in \mathcal{G}_e$. The children of a node $v$ are the largest subgroups of $G_v$. From Lemma 3 it follows that any two children of $v$ are disjoint. Hence an entity $x \in G_v$ occurs in only one child of $v$. Furthermore, note that the starting times are monotonically decreasing on the path from the root to a leaf: smaller groups started earlier. A leaf corresponds to a smallest maximal group on $e$: a singleton set with an entity $x \in C_e$. It follows that $T_e$ has $O(n)$ leaves, and therefore has size $O(n)$. Note, however, that the summed sizes of all maximal groups can be quadratic.

Analysis. We analyze the time required to label each edge $e$ with a tree $T_e$ for a given Reeb graph $\mathcal{R} = (V, E)$. Topologically sorting the vertices takes linear time. So the running time is determined by the processing time in each vertex, that is, computing the tree(s) $T_e$ on the outgoing edge(s) $e$ of each vertex. Start, end, and merge vertices can be handled in $O(1)$ time: start and end vertices are trivial, and at a merge vertex $v$ the tree $T_e$ is simply a new root node with time $t_v$ and as children the (roots of the) trees of the incoming edges. At a split vertex we have to split the tree $T = T_{(u,v)}$ of the incoming edge $(u, v)$ into two trees for the outgoing edges of $v$. For this, we traverse $T$ in a bottom-up fashion, and for each node, check whether it induces a vertex in one or both of the trees after splitting. This algorithm runs in $O(|T|)$ time. Since $|T| = O(n)$ the total running time of our algorithm is $O(n|V|) = O(rn^3)$.

Figure 10: The grouping tree for the edge between $t_2$ and $t_3$ in Fig. 9.
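To make the merge and split operations on the grouping tree concrete, the following Python sketch shows one possible way to organize them. It is a simplified illustration under stated assumptions, not the authors' Haskell code: the `Node` class, the function names `merge`, `split`, and `entities`, and the recursive formulation are all hypothetical. The split restricts a tree to the entities that continue on one outgoing edge in a single bottom-up pass. A node with at least two surviving children induces a node in the new tree, and a node with exactly one surviving child is replaced by that child, which carries the earlier starting time.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    start: float                      # starting time of the group at this node
    entity: Optional[int] = None      # set only for leaves (singleton groups)
    children: List["Node"] = field(default_factory=list)


def entities(node: Node) -> Set[int]:
    """Collect the entities stored in the leaves below a node."""
    if node.entity is not None:
        return {node.entity}
    result: Set[int] = set()
    for child in node.children:
        result |= entities(child)
    return result


def merge(trees: List[Node], t_v: float) -> Node:
    """Merge vertex: a new root with time t_v over the incoming roots."""
    return Node(start=t_v, children=list(trees))


def split(node: Node, keep: Set[int]) -> Optional[Node]:
    """Split vertex: restrict the tree to the entities in keep."""
    if node.entity is not None:
        if node.entity in keep:
            return Node(start=node.start, entity=node.entity)
        return None
    kids = [k for k in (split(c, keep) for c in node.children) if k is not None]
    if not kids:
        return None                   # no entity of this group continues
    if len(kids) == 1:
        return kids[0]                # same entity set as the child: keep earlier start
    return Node(start=node.start, children=kids)


# Build a small tree by successive merges, then split it.
leaves = {i: Node(start=0.0, entity=i) for i in (1, 2, 3, 4)}
t12 = merge([leaves[1], leaves[2]], 1.0)
t123 = merge([t12, leaves[3]], 3.0)
root = merge([t123, leaves[4]], 5.0)
left = split(root, {1, 2})
right = split(root, {3, 4})
```

Calling `split` once per outgoing edge visits every node of the incoming tree a constant number of times, which matches the linear-time bound stated above. A production version would likely avoid recursion for deep trees.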

Reporting the Groups. We can augment our algorithm to report all maximal groups at split and end vertices. The main observation is that a maximal group ending at a split vertex $v$ corresponds exactly to a node in the tree $T_{(u,v)}$ (before the split) that has entities in leaves below it that separate at $v$. The procedures for handling split and end vertices can easily be extended to report the maximal groups of size at least $m$ and duration at least $\delta$ by simply checking this for each maximal group. Although the number of maximal groups is $O(rn^3)$ (Theorem 2), the summed size of all maximal groups can be $\Omega(rn^4)$. The running time of our algorithm is $O(rn^3 + N)$, where $N$ is the total output size.

Theorem 4 Given a set $X$ of $n$ entities, in which each entity travels along a trajectory of $r$ edges, we can compute all maximal groups in $O(rn^3 + N)$ time, where $N$ is the output size.



4 Robustness 



The grouping structure definition we have given and analyzed has a number of good properties. It fulfills 
monotonicity, and in the previous sections we showed that there are only polynomially many maximal 
groups, which can be computed in polynomial time as well. In this section we study the property of 
robustness, which our definition of grouping structure does not have yet. Intuitively, a robust grouping 
structure ignores short interruptions of groups, as these interruptions may be insignificant at the temporal 
scale at which we are studying the data. For example, if we are interested in groups that have a duration 
of one hour or more, we may want to consider interruptions of a minute or less insignificant. 

We introduce a new temporal parameter $\alpha$, which is related to the temporal scale at which the data is studied. Our robust grouping structure should ignore interruptions of duration at most $\alpha$. We realize this by letting the precise moment of events be irrelevant beyond a value depending on $\alpha$. Events that happen within time $\alpha$ of each other may cancel out, or their order may be exchanged. The objective is to incorporate $\alpha$ into our definitions while maintaining the properties that we have for the (non-robust) grouping structure. Note that $\alpha$ is another parameter that allows us to obtain more generalized views of the grouping structure by increasing its value. Obtaining generalized views in this way is related to the concept of persistence in computational topology [4, 5].

A possible definition of a robust grouping structure is based on the following intuition: A set of entities forms a robust group on $I$ as long as every interval $I' \subseteq I$ on which its entities are not in the same component has length at most $\alpha$. More formally: we say $G$ is a robust group on time interval $I$ if and only if: (i) $G$ contains at least $m$ entities, (ii) $I$ has length at least $\delta$, and (iii) for any time $t \in I$ there is a time $t' \in [t - \alpha/2, t + \alpha/2]$ and a component $C \in \mathcal{C}(t')$ such that $G \subseteq C$. Unfortunately, we can show that even determining whether there is a robust group of size $k$ is NP-complete (see Appendix A).

We consider a second definition for a robust group, which we will use from now on. Two entities are $\alpha$-relaxed directly connected at time $t$ if and only if they are directly connected at some time $t' \in [t - \alpha/2, t + \alpha/2]$. Two entities $x$ and $y$ are $\alpha$-relaxed $\varepsilon$-connected at time $t$ if there is a sequence $x = x_0, .., x_j = y$ such that $x_i$ and $x_{i+1}$ are $\alpha$-relaxed directly connected. Note that the precise times may be different for different pairs $x_i$ and $x_{i+1}$, as long as each time is in the interval $[t - \alpha/2, t + \alpha/2]$. A maximal set of $\alpha$-relaxed $\varepsilon$-connected entities at time $t$ is an $\alpha$-relaxed component, or $\alpha$-component for short. An $\alpha$-component at time $t$ corresponds to a connected 3D component in a horizontal slice of $\mathcal{M}$ with thickness $\alpha$ and centered at $t$ (see Fig. 11).
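The definition of an α-component can be illustrated with a short union-find sketch in Python. The input format is an assumption made for this example: `contacts` maps an unordered pair of entities to the closed time intervals during which the pair is directly connected. The function name `alpha_components` is likewise hypothetical. Two entities are joined whenever one of their contact intervals intersects the window [t - α/2, t + α/2].

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Interval = Tuple[float, float]


def alpha_components(entities: Iterable[str],
                     contacts: Dict[Tuple[str, str], List[Interval]],
                     t: float,
                     alpha: float) -> Set[FrozenSet[str]]:
    """Return the alpha-relaxed components at time t."""
    parent = {x: x for x in entities}

    def find(x: str) -> str:
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    lo, hi = t - alpha / 2, t + alpha / 2
    for (x, y), intervals in contacts.items():
        # alpha-relaxed directly connected: some contact interval meets the window.
        if any(start <= hi and end >= lo for start, end in intervals):
            parent[find(x)] = find(y)

    components: Dict[str, Set[str]] = {}
    for x in parent:
        components.setdefault(find(x), set()).add(x)
    return {frozenset(c) for c in components.values()}


# a-b touch during [0, 1], b-c touch during [2, 3]; they never touch simultaneously.
contacts = {("a", "b"): [(0.0, 1.0)], ("b", "c"): [(2.0, 3.0)]}
strict = alpha_components("abc", contacts, t=1.5, alpha=0.0)
relaxed = alpha_components("abc", contacts, t=1.5, alpha=1.0)
```

With α = 0 the three entities are separate at time 1.5, while with α = 1 they form a single α-component even though the two contacts happen at different times inside the window. This is exactly the freedom the definition allows.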

A subset $G$ of $k$ entities is a robust group if and only if it is a group by the definition in the introduction, but where "component" is replaced by "$\alpha$-component" in condition (iii). This immediately leads to the definition of maximal robust groups and a robust grouping structure. The robust grouping structure has the property of monotonicity in the new parameter $\alpha$ as well. Note that every group which is a robust group according to the first definition is also a robust group according to the second definition.
For instance, in Fig. 11, entities x±, ..,xq form a component by the second definition, but not by the first. 




Figure 11: An $\alpha$-component at time $t$. (The slice spans the times $t - \alpha/2$ to $t + \alpha/2$.)



4.1 Computation of Maximal Robust Groups 

We can compute all maximal robust groups according to the (second) definition. The idea is to modify the Reeb graph to a version that is parametrized by $\alpha$ and captures exactly the robust grouping structure for parameter $\alpha$.

Let $\mathcal{R}$ be the Reeb graph that we used for the grouping structure without considering robustness. Note that this is the same as assuming $\alpha = 0$ in the definition of the robust grouping structure, and we let $\mathcal{R}_0 = \mathcal{R}$. For $\alpha > 0$ we define the Reeb graph parametrized in $\gamma$ as $\mathcal{R}_\gamma$ by imagining a process that changes the Reeb graph for a growing parameter $\gamma$, starting with $\mathcal{R}_0$ and ending with $\mathcal{R}_{\alpha/2}$.

We observe that a new $\alpha$-component starts at time $\alpha/2$ before two regular components merge and form a new component. Symmetrically, an $\alpha$-component ends due to a split at time $\alpha/2$ after a regular component splits. Both facts follow from the new definition of $\alpha$-relaxed directly connected. It implies that in the process that maintains $\mathcal{R}_\gamma$ for growing $\gamma$, the split nodes move forward in time, zippering together the outgoing edges, and the merge nodes move backward in time, zippering together the incoming edges. All nodes move at the same rate in $\gamma$, which implies that in the process, the only event where the Reeb graph changes structurally is when an (earlier) split node encounters a (later) merge node. This can happen only if they are endpoints of the same edge of the Reeb graph. The encounter is either a passing or a collapse (see Fig. 12).

Both encounters lead to new edges in the Reeb graph and can thus give rise to new encounters when growing $\gamma$ further. The collapse encounter reduces the complexity of the Reeb graph: two nodes of degree 3 disappear and four edges become a single edge. The collapse event is exactly the situation where a component splits and merges again, so by removing a split-merge pair involving the same entities we ignore the temporary split of a component (or group).

A passing encounter maintains the complexity of the Reeb graph. Before the passing encounter, a part of one group splits and merges with a different group. After the passing encounter, the two groups merge (for a short time) and then split again. This situation is also captured in Fig. 11.

Figure 12: Passing encounter, before and after (a). Collapse encounter, before and after (b).

Next, we show that there are $O(rn^3)$ encounter events in the Reeb graph of the robust version of the trajectory grouping structure, and this bound is tight in the worst case.

Lemma 4 For some set $X$ of $n$ entities, in which each entity travels along a trajectory of $r$ edges, the structure of the Reeb graph $\mathcal{R}_\gamma$ of $X$ changes $\Omega(rn^3)$ times when increasing $\gamma$ from zero to infinity.

Proof. We show that there is a set of $n$ trajectories, each consisting of $r$ edges, for which there are $\Omega(rn^3)$ encounter events. The lemma then follows.

We use the same construction as in Lemma 2. So in all time intervals $[t_{2i}, t_{2i+1}]$ we have a set $S$ of $3n/4$ stationary entities/discs and a set $D = \{d_1, .., d_{n/4}\}$ of entities, ordered from right to left, that move to the right in such a way that $d_i$ becomes directly (dis)connected with $S$ before $d_{i+1}$ (see Fig. 5). Let $t_a$ be the first time at which $d_{n/4}$ becomes directly connected with $S$, and let $t_b$ denote the last time $d_1$ becomes directly disconnected from $S$. We now show that the part of the Reeb graph $\mathcal{R}'$ corresponding to the interval $(t_a, t_b)$ already yields $\Omega(n^3)$ encounter events. We note that no other encounter events involving other parts of the Reeb graph can interfere with the encounter events in $\mathcal{R}'$.

In between $t_a$ and $t_b$ every disc $d_i$ becomes directly (dis)connected with $S$ $O(n)$ times. So $\mathcal{R}'$ initially consists of a path $P$ of $\Omega(n^2)$ edges. Each edge has at least the set of entities $S$ associated with it, and possibly other entities as well. The vertices on $P$ can be grouped in $\Omega(n)$ sequences of $k = n/4$ split vertices $u_1, .., u_k$ followed by $k$ merge vertices $v_1, .., v_k$. At vertex $u_i$ entity $x_i$ splits from $S$ and at $v_i$ entity $x_i$ merges with $S$. See Fig. 13.




Figure 13: The part of the Reeb graph, between $t_a$ and $t_b$, that yields $\Omega(n^3)$ encounter events (for $n = 16$).

By increasing $\gamma$ each split vertex $u_i$ will have a passing encounter with the merge vertices $v_1, .., v_{i-1}$ before it collapses with $v_i$. Hence each sequence involves $\sum_{i=1}^{k} (i - 1) = \Omega(n^2)$ encounter events. Since there are $\Omega(n)$ such sequences this gives $\Omega(n^3)$ encounter events in a single timestep, and hence $\Omega(rn^3)$ in total. □
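As a worked check of the count in the proof, the passing encounters in one sequence of $k = n/4$ split vertices can be summed explicitly. The instance $n = 16$ is the one drawn in Fig. 13; the closed form itself is standard arithmetic rather than something stated in the text.

```latex
\[
  \sum_{i=1}^{k} (i - 1) \;=\; \frac{k(k-1)}{2},
  \qquad
  k = \frac{n}{4}
  \;\Longrightarrow\;
  \frac{k(k-1)}{2} \;=\; \frac{n(n-4)}{32} \;=\; \Omega(n^2).
\]
\[
  \text{For } n = 16:\quad k = 4, \qquad 0 + 1 + 2 + 3 \;=\; 6 \;=\; \frac{16 \cdot 12}{32}.
\]
```

Each sequence additionally ends with $k$ collapse encounters, which does not change the asymptotic count.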

Theorem 5 Let $X$ be a set of $n$ entities, in which each entity travels along a trajectory of $r$ edges. The structure of the Reeb graph $\mathcal{R}_\gamma$ of $X$ changes at most $O(rn^3)$ times when increasing $\gamma$ from zero to infinity. This bound is tight in the worst case.

Proof. Lemma 4 gives a construction that shows that there may be $\Omega(rn^3)$ encounters.

Since each collapse event decreases the number of edges by three it follows that the number of collapse events is at most $O(rn^2)$. What remains is to prove that the number of passing events is $O(rn^3)$. Each passing event involves a split vertex $u$ and a merge vertex $v$. We now show that there are at most $n$ passing events involving a given split vertex $u$. Since there are $O(rn^2)$ split vertices this means the number of passing events is $O(rn^3)$.

Assume by contradiction that there are $k > n$ passing events involving split vertex $u$. Let $\gamma_1, .., \gamma_k$ be the values for $\gamma$ for which these passing events occur in non-decreasing order, and let $v_1, .., v_k$ be the corresponding merge vertices. Just before $u$ passes $v_i$ the edge $e = (u, v_i)$ is an incoming edge of $v_i$. Let $X_i$ denote the set of entities on the other incoming edge of $v_i$, that is, the set of entities that merges with $C_e$ at vertex $v_i$ (see Fig. 14(a)).

Since $k > n$ there must be an entity $x$ that $u$ "passes" at least twice. That is, $u$ passes $v_i$ and $v_j$, with $i < j$, and $x \in X_i$ and $x \in X_j$. Now consider the Reeb graph $\mathcal{R}_\gamma$ just after $u$ passes $v_i$ (which means $\gamma > \gamma_i$). Since $u$ still has to pass $v_j$ there is a path $Q$ connecting $u$ to $v_j$. By further increasing $\gamma$ this path will eventually become a single edge $(u, v_j)$, which will flip to $(v_j, u)$ when $u$ passes $v_j$ at $\gamma = \gamma_j$.

Entity $x$ is present at the first vertex of $Q$ (vertex $u$), and it merges again with path $Q$ at $v_j$. Clearly, this means that $Q$ contains a split vertex $w$ at which $x$ splits from path $Q$ before it can return to $Q$ in vertex $v_j$ (see Fig. 14(b)).

We now have two paths connecting $w$ to $v_j$: the path that $x$ follows and the subpath of $Q$. We again have that by increasing $\gamma$ both paths will become single edges connecting $w$ to $v_j$. Eventually both these edges are removed in a collapse event for some $\gamma$. If $w = u$ this means $(u, v_j)$ is actually a collapse event instead of a passing event. Contradiction. If $w \neq u$ we have that $t_w > t_u$, and therefore $\gamma < \gamma_j$. The collapse event at $\gamma$ will consume both $w$ and $v_j$, which means $u$ can no longer pass $v_j$. Contradiction. Since both cases yield a contradiction we conclude that the number of passing events involving $u$ is at most $n$. With $O(rn^2)$ vertices this yields the desired bound of $O(rn^3)$ passing events. □

Figure 14: The part of $\mathcal{R}_\gamma$ before $u$ encounters $v_i$. The set $X_i$ merges with $C_e$ at vertex $v_i$ (a). If $x$ merges at both $v_i$ and $v_j$ it has to leave (split) at a vertex $w$ in between (b).

Algorithmically, we start with the Reeb graph $\mathcal{R}_0$ and examine each edge. Any edge that leads from a split node to a merge node and whose duration is at most $\alpha$ is inserted in a priority queue, where the duration of the edge is the priority. We handle the encounter events in the correct order, changing the Reeb graph and possibly inserting new encounter events in the priority queue. Each event is handled in $O(\log n)$ time since it involves at most $O(1)$ priority queue operations. Since there are $O(rn^3)$ events (Theorem 5) this takes $O(rn^3 \log n)$ time in total. Once we have the Reeb graph $\mathcal{R}_{\alpha/2}$, we can associate the trajectories with its edges as before. The computation of the maximal robust groups is done in the same way as computing the maximal groups on the normal Reeb graph $\mathcal{R}$. We conclude:

Theorem 6 Given a set $X$ of $n$ entities, in which each entity travels along a trajectory of $r$ edges, we can compute all robust maximal groups in $O(rn^3 \log n + N)$ time, where $N$ is the output size.
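The first step of the algorithm preceding Theorem 6, filling the priority queue with candidate encounter events, might be sketched in Python as follows. The `Edge` record and its field names are assumptions made for this illustration, and the sketch covers only the initialization. The structural updates for passing and collapse encounters, and the insertion of follow-up events, are deliberately left out.

```python
import heapq
from typing import List, NamedTuple, Tuple


class Edge(NamedTuple):
    source_kind: str    # "start", "end", "merge", or "split" (hypothetical labels)
    target_kind: str
    t_source: float     # time of the source vertex
    t_target: float     # time of the target vertex


def initial_events(edges: List[Edge], alpha: float) -> List[Tuple[float, int]]:
    """Queue every split-to-merge edge of duration at most alpha.

    Returns a binary heap of (duration, edge index) pairs, so the
    shortest edge, which is the first encounter, is popped first.
    """
    heap: List[Tuple[float, int]] = []
    for index, edge in enumerate(edges):
        duration = edge.t_target - edge.t_source
        if (edge.source_kind == "split" and edge.target_kind == "merge"
                and duration <= alpha):
            heapq.heappush(heap, (duration, index))
    return heap


edges = [
    Edge("split", "merge", 0.0, 3.0),   # queued, duration 3.0
    Edge("merge", "split", 1.0, 1.5),   # wrong orientation, skipped
    Edge("split", "merge", 2.0, 2.5),   # queued, duration 0.5
    Edge("split", "merge", 0.0, 9.0),   # longer than alpha, skipped
]
heap = initial_events(edges, alpha=4.0)
order = [heapq.heappop(heap)[1] for _ in range(len(heap))]
```

Using a binary heap gives logarithmic cost per queue operation, in line with the per-event bound stated above.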

5 Evaluation 

To see if our model of the grouping structure is practical and indeed captures the grouping behavior of 
entities we implemented and evaluated our algorithms. We would like to visually inspect the maximal 
groups identified by our algorithm, and compare this to our intuition of groups. For a small number of 
(short) trajectories we can still show this in a figure, see for example Fig. 15, which shows the mono- 
tonicity of the maximal groups in size and duration. However, for a larger number of trajectories the 
resulting figures become too cluttered to analyze. So instead we generated short videos. 1 

We use two types of data sets to evaluate our method: a synthetic data set generated using a slightly 
modified version of the NetLogo Flocking model [23, 24], and a real-world data set consisting of deer, 
elk, and cattle, tracked in the Starkey project [17]. 

NetLogo. We generated several data sets using an adapted version of the NetLogo Flocking model [23]. 
In our adapted model the entities no longer wrap around the world border, but instead start to turn when 
they approach the border. Furthermore, we allow small random direction changes for the entities. The 
data set that we consider here contains 400 trajectories, with 818 edges each. Similar to Fig. 15, our 
videos show all maximal groups for varying parameter values. 

The videos show that our model indeed captures the crucial properties of grouping behavior well. We notice that the choice of parameter values is important. In particular, if we make $\varepsilon$ too large we see that the entities are loosely coupled, and too many groups are found. Similarly, for large values of $m$ virtually no groups are found. However, for reasonable parameter settings, for example $\varepsilon = 5.25$, $m = 4$, and $\delta = 100$, we can clearly see that our algorithm identified virtually all sets of entities that travel together. Furthermore, if we see a set of entities traveling together that is not identified as a group, we indeed see that they disperse quickly after they have come together. The coloring of the line-segments also nicely shows how smaller groups merge into larger ones, and how the larger groups break up into smaller subgroups. This is further evidence that our model captures the grouping behavior well.

Figure 15: The maximal groups for varying parameter values. The time associated with each trajectory vertex is proportional to its x-coordinate. (Legend: group size.)

1 See www.staff.science.uu.nl/~staal006/grouping.

Starkey. We also ran our algorithms on a real-world data set, namely on tracking data obtained in the 
Starkey project [17]. This data set captures the movement of deer, elk, and cattle in Starkey, a large forest 
area in Oregon (US), over three years. Not all animals are tracked during the entire period, and positions 
are not reported synchronously for all entities. Thus, we consider only a subset of the data, and resample 
the data such that all trajectories have vertices at the same (regularly spaced) times. We chose a period 
of 30 days for which we have the locations of most of the animals. This yields a data set containing 126 
trajectories with 1264 vertices each. In the Starkey video we can see that a large group of entities quickly 
forms in the center, and then slowly splits into multiple smaller groups. We notice that some entities 
(groups) move closely together, whereas others often stay stationary, or travel separately. 
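The resampling step mentioned above, giving all trajectories vertices at the same regularly spaced times, could look like the following Python sketch. The text does not say how positions between samples were obtained, so the linear interpolation used here is an assumption, as are the function name `resample` and the `(t, x, y)` tuple format.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

Sample = Tuple[float, float, float]     # (time, x, y)


def resample(samples: Sequence[Sample], times: Sequence[float]) -> List[Sample]:
    """Linearly interpolate a trajectory at the given query times.

    samples must be sorted by strictly increasing time, and every query
    time must lie within the sampled time range.
    """
    stamps = [s[0] for s in samples]
    result: List[Sample] = []
    for t in times:
        if not stamps[0] <= t <= stamps[-1]:
            raise ValueError(f"time {t} outside the tracked period")
        j = bisect_right(stamps, t)
        if j == len(stamps):            # t coincides with the last sample
            result.append((t, samples[-1][1], samples[-1][2]))
            continue
        t0, x0, y0 = samples[j - 1]
        t1, x1, y1 = samples[j]
        w = (t - t0) / (t1 - t0)        # fraction of the way from t0 to t1
        result.append((t, x0 + w * (x1 - x0), y0 + w * (y1 - y0)))
    return result


track = [(0, 0, 0), (10, 10, 20), (30, 10, 0)]
regular = resample(track, [0, 5, 10, 20, 30])
```

Applying the same list of query times to every trajectory yields vertices at common times, which is the property the grouping computation relies on.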

Running Times. Since we are mainly interested in how well our model captures the grouping behavior, 
we do not extensively evaluate the running times of our algorithms. On our desktop system with an
AMD Phenom II X2 CPU running at 3.2 GHz our algorithm, implemented in Haskell, computes the
grouping structure for our data sets in a few seconds. Even for 160 trajectories with roughly 20 thousand 
vertices each we can compute and report all maximal groups in three minutes. Most of the time is 
spent on computing the Reeb graph, in particular on computing the connect/disconnect events. Since 
our implementation uses a slightly easier, yet slower, data structure to represent the maximum weight 
spanning forest during the construction of the Reeb graph, we expect that some speedup is still possible. 

6 Concluding Remarks 

We introduced a trajectory grouping structure which uses Reeb graphs and a notion of persistence for robustness. We showed how to characterize and efficiently compute the maximal groups and group changes
in a set of trajectories, and bounded their maximal number. Our paper demonstrates that computational 
topology provides a mathematically sound way to define grouping of moving entities. The complexity 
bounds, algorithms and implementation together form the first comprehensive study of grouping. Our 
videos show that our methods produce results that correspond to human intuition. 

Further work includes more extensive experiments together with domain specialists, such as behavioral 
biologists, to ensure further that the grouping structure captures groups and events in a natural, expected 
way, and changes in the parameters have the desired effect. At the same time, our research may be linked 
to behavioral models of collective motion [20] and provide a (quantifiable) comparison of these. 

We expect that for realistic inputs the size of the grouping structure is much smaller than the worst-case bound that we proved. We plan to confirm this in experiments, and to provide faster algorithms under realistic input models. We will also work on improving the visualization of the maximal groups and the grouping structure, based on the reduced Reeb graph.
A NP-completeness of robust grouping by the first definition



Theorem 7 Determining whether there is a robust group of size k is NP-complete using the first defini- 
tion of robust groups. 

Proof. We prove this by a reduction from Clique: given a graph $G = (V, E)$ is there a clique of size $k$? Choose $\varepsilon = 0$, $m < k$, $\delta < n + 1$, and $\alpha = 3/4$. We now construct a set of $n$ trajectories, one for each vertex, each consisting of $O(n)$ vertices such that there is a robust group $R$ on $I = [1, n + 1]$ consisting of $k$ entities if and only if $G$ contains a clique $R'$ of size $k$. The proof idea is similar to that in [10].

Let $N(v)$ denote the neighbours of vertex $v \in V$. For each vertex $v_i$ we define five points $p_i$, $a_i$, $b_i$, $c_i$, and $d_i$. Additionally, we define a point $p_{n+1}$. We assume that all these points (over all vertices) are different. Let $s_i = (i + 1) - \alpha = i + 1/4$ and $t_i = i + \alpha = i + 3/4$ be two times corresponding to vertex $v_i$. We now construct an entity/trajectory $x_i$ for each vertex $v_i \in V$ such that:

• at time $j$, $x_i$ is at $p_j$,

• at time $s_j$, $x_i$ is at $a_j$ if $v_i = v_j$, and at $b_j$ otherwise,

• at time $t_j$, $x_i$ is at $c_j$ if $v_i \in \{v_j\} \cup N(v_j)$, and at $d_j$ otherwise, and

• at any other time no two entities are at the same place at the same time.

Fig. 16 shows an example of this construction. 
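The construction can be tried out on a small graph with the Python sketch below. It is a discretized illustration under explicit assumptions, not part of the original proof. Points are represented by labels such as `("a", j)` instead of coordinates. Only the special times $j$, $s_j$, and $t_j$ are modeled, on the assumption that entities never coincide at other times. The check `is_robust_group` follows the argument of the proof (in every unit interval the group must be together at $s_j$ or at $t_j$) rather than the general definition. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from typing import Dict, Iterable, List, Tuple

ALPHA = Fraction(3, 4)
Label = Tuple[str, int]


def build_schedule(n: int, edges: List[Tuple[int, int]]) -> Dict[int, Dict[Fraction, Label]]:
    """position[i][time] is the point label of entity x_i (vertices are 1..n)."""
    neighbours = {i: set() for i in range(1, n + 1)}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    position: Dict[int, Dict[Fraction, Label]] = {}
    for i in range(1, n + 1):
        where: Dict[Fraction, Label] = {}
        for j in range(1, n + 2):
            where[Fraction(j)] = ("p", j)            # everyone meets at p_j
        for j in range(1, n + 1):
            s_j = j + 1 - ALPHA                      # j + 1/4
            t_j = j + ALPHA                          # j + 3/4
            where[s_j] = ("a", j) if i == j else ("b", j)
            close = i == j or i in neighbours[j]
            where[t_j] = ("c", j) if close else ("d", j)
        position[i] = where
    return position


def together(position, group: Iterable[int], time: Fraction) -> bool:
    """True if all entities of the group share one point at the given time."""
    return len({position[i][time] for i in group}) == 1


def is_robust_group(position, group: Iterable[int], n: int) -> bool:
    """Proof-based check: together at s_j or at t_j for every j in 1..n."""
    group = list(group)
    return all(
        together(position, group, j + 1 - ALPHA) or together(position, group, j + ALPHA)
        for j in range(1, n + 1)
    )


def is_clique(edges: List[Tuple[int, int]], group: Iterable[int]) -> bool:
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset(pair) in edge_set for pair in combinations(group, 2))


n = 4
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
position = build_schedule(n, edges)
triangle_is_robust = is_robust_group(position, (1, 2, 3), n)
```

On this four-vertex graph, the triangle {1, 2, 3} passes the check while any set containing a non-adjacent pair, such as {1, 4}, fails it, mirroring the equivalence claimed in the theorem.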




Figure 16: An input graph $G = (V, E)$ (a), and the trajectories for $G$, where the x-coordinate of the points corresponds to the time (b). One of the trajectories is shown in bold.

Since $\varepsilon$ is set to zero all entities in a robust group $R$ have to be at the same point in every interval of length $\alpha$. The only times when multiple entities are at the same point are at times $i$, $s_i$, $t_i$, with $1 \leq i \leq n + 1$. Because $(i + 1) - i > \alpha$ it follows that all entities in $R$ have to be together at $s_i$ or $t_i$. We now select a vertex $v_i$ to be part of the clique $R'$ if and only if the entities in $R$ were not together at time $s_i$. All entities except $x_i$ are together at time $s_i$, so it follows that $x_i \in R$. We then have $R' = \{v_i \mid x_i \in R\}$.

Suppose there is a robust group $R$ of size $k$ on $I$. We now show that for every pair $v_i, v_j \in R'$, $v_i$ and $v_j$ are neighbours. Hence $R'$ forms a clique (of size $k$).

Both $v_i$ and $v_j$ are in $R'$, so $x_i$ and $x_j$ are in $R$. Entities $x_i$ and $x_j$ cannot be at the same point at time $s_i$ since $x_i$ is the only entity on point $a_i$. The same holds for $s_j$. So they must have been together at $t_i$ and $t_j$. In particular, they must have been at points $c_i$ and $c_j$, and hence $v_i$ and $v_j$ are neighbours.

The proof for the other direction, i.e., if R' is a clique in G then R is a robust group, is symmetrical. 
Clearly, the reduction is polynomial. Since it is also easy to check that a given set of entities forms a 
robust group we conclude that the problem is NP-complete. □ 